ABSTRACT


With the exponential growth of Big Data on the transactional side of web architectures, the use of NoSQL databases has grown considerably in
recent years. One of the key concerns for every database developer is the ability to communicate dynamically and exchange data between
traditional RDBMSs, Hadoop and NoSQL databases. Moreover, advances in application development increase the developer's overhead
of writing complex yet optimized queries. A number of applications on the market focus on migration activities, and others
provide auto-complete features to ease development. This, however, leaves the developer managing a collection of separate tools, which is tedious
when working on a critical application. In this work, we propose building a central web-based graphical tool for exchanging data between MongoDB, Hadoop and RDBMS.
Additionally, the system aims to provide the first completely optimized EMS with graphical querying features. Whether it is a simple query, a complex query, an
aggregation function or an aggregation Map Reduce, the user need not write a single line of code or a single command to get things done.
1. INTRODUCTION
There exist two major problems in the world of databases that we try to
address: the lack of a GUI tool and the lack of a universal data migration tool.


1.1 No GUI Tool 

Although a large number of client tools are available for databases, there is no
tool that provides a complete GUI for working with data. The user is expected
to have knowledge of databases and to be well versed in the techniques
involved in fetching and processing data. As a result, application
developers spend much of their time with database developers to make sure
that they have the right query for their specifications. Moreover, as
more aggregate functions come into the picture, writing optimized methods for
processing the data becomes more complex. This makes it even more difficult for
developers to work with newer databases like MongoDB, which provide
extensible features but do not follow the traditional SQL framework. In this
paper we aim to build a web-based GUI tool that removes the need to know the
syntax by providing a stepwise iconic representation of the different methods
for processing data. Moreover, the integrated library of the tool will provide
a facility to auto-correct errors. So even users with no database background,
as long as they know what they are looking for in their data, will be able to
fetch exactly what they need in an optimized way.


1.2 No Universal Data Migration Tool 

With the advent of Big Data, managing data using a traditional RDBMS becomes
complex and tedious. It has been observed that features of an RDBMS such as
fixed schemas, tabular structures, SQL joins and ACID guarantees have turned out
to be limitations when handling Big Data. NoSQL databases like MongoDB, on the
other hand, make it relatively easier to handle Big Data on the transactional
front. As a result, in recent years the industry has seen a lot of data
migration from SQL databases to NoSQL databases. This migration, however,
introduces new challenges for developers, who must handle different tools for
exchanging data between different platforms. It also introduces the overhead of
transforming queries from one form to another to suit the syntax of each data
storage platform, since the platforms use different languages for processing
data. Through the implementation of this paper we propose to build a tool that
enables easy data migration between MongoDB, MySQL and Hadoop. Because the tool
is entirely GUI based, we expect the migration process to be carried out with
simple clicks, without writing a single function, query or any other code. The
tool will also aim at building a connected environment in which any query can
be converted internally to an equivalent MongoDB function and can fetch the
required data from the MongoDB databases without changing the query
implementation on the programming end.
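
To make the idea of internal query conversion concrete, the following is a minimal sketch in Python, not the tool's actual implementation. It assumes a deliberately tiny SQL subset (a single-table SELECT whose WHERE clause is a list of comparisons joined by AND), and the names sql_to_mongo and parse_literal are hypothetical. The output is a MongoDB-style filter document and projection document, which a driver's find call could accept.

```python
import re

# Map SQL comparison operators to MongoDB query operators.
SQL_TO_MONGO_OPS = {
    "=": "$eq", "!=": "$ne", ">": "$gt", ">=": "$gte", "<": "$lt", "<=": "$lte",
}

# Assumption: only "SELECT cols FROM table [WHERE a op v AND b op v ...]".
SELECT_RE = re.compile(
    r"^\s*SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<where>.+?))?\s*;?\s*$",
    re.IGNORECASE,
)
COND_RE = re.compile(r"^\s*(\w+)\s*(>=|<=|!=|=|>|<)\s*(.+?)\s*$")


def parse_literal(text):
    """Turn a SQL literal into a Python value (quoted string, int or float)."""
    if len(text) >= 2 and text[0] == text[-1] == "'":
        return text[1:-1]
    try:
        return int(text)
    except ValueError:
        return float(text)


def sql_to_mongo(sql):
    """Return (collection name, filter document, projection document or None)."""
    match = SELECT_RE.match(sql)
    if match is None:
        raise ValueError("unsupported statement")

    cols = [c.strip() for c in match.group("cols").split(",")]
    # "SELECT *" means no projection; otherwise include each named field.
    projection = None if cols == ["*"] else {c: 1 for c in cols}

    query = {}
    where = match.group("where")
    if where:
        # Limitation of this sketch: string literals containing " AND "
        # are not handled.
        for part in re.split(r"\s+AND\s+", where, flags=re.IGNORECASE):
            cond = COND_RE.match(part)
            if cond is None:
                raise ValueError(f"unsupported condition: {part!r}")
            field, op, raw = cond.groups()
            query.setdefault(field, {})[SQL_TO_MONGO_OPS[op]] = parse_literal(raw)
    return match.group("table"), query, projection


print(sql_to_mongo("SELECT name, age FROM users WHERE age >= 21 AND city = 'Pune'"))
```

Conditions are always emitted in operator form (for example {"$eq": ...}) so that several conditions on the same field can share one sub-document. A real converter would need a proper SQL parser to cover joins, OR, nesting and quoting rules.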


2. MOTIVATION 

Working with database clients that have auto-complete features enabled does
not actually allow a developer to write efficient queries without knowing
anything about them. The developer needs to know at least part of a command in
order to leverage the auto-complete feature. This works well for traditional
databases, where we are well versed in all the available options, since SQL has
been in use for over three decades. But for newer databases like MongoDB, the
scenario is a bit different. MongoDB introduces a new set of efficient
functions in each release which, if used to their full potential, can help
process data in a much more optimized way. But because of a lack of awareness
of the new additions, or because of the habit of using the common functions,
developers do not make extensive use of the available features. A tool like
MS Paint, on the other hand, does not leave you wondering what you can do on a
canvas; it gives you a complete list of available options to explore, so that
you can paint what you imagine without getting into the complexity of computer
graphics. If this can be done on a graphical platform, then why not implement
something similar for the database platform to make life with databases
easier?


3. TAXONOMY AND TERMINOLOGY 

3.1 NoSQL Databases 

To handle the problems of Big Data [2], a family of databases emerged that [4]:

* Supports structured and unstructured data.
* Stores data in a flat file system.
> Data is stored in binary format.
* Can support anything as long as your application can understand it.
> For example, you can write the value of the field AGE as either 24 or
twenty-four or 20+4.
* Does not enforce the concept of data types.
* Can scale horizontally.
* Uses functions to query the data, so that working with flat files becomes
easy.
* Embeds all properties of an entity within one object, i.e. has embedded
objects. This removes the need to maintain joins [2], as joins become
expensive as the number of tables increases.

These databases are called NoSQL databases, popularly expanded as Not Only
SQL databases [4]. They do not follow the properties of the traditional
RDBMS. Based on how they store and retrieve data, they can be broadly
classified as:


A. Key Value Pair Store Databases [2][4]
* Popular in this category is Amazon's Dynamo.

B. Columnar Store Databases [4]
* Popular in this category are Google Bigtable and Cassandra (originally
developed at Facebook).

C. Document Store Databases [4]
* Popular in this category is MongoDB, developed by 10gen (now known as
MongoDB, Inc.).

D. Graph Based Databases [4]
* Popular in this category is Neo4j.


3.2 MongoDB 

MongoDB is a leading NoSQL database available in the market today. The name
MongoDB is derived from the English word "humongous", which means enormous or
huge. The name itself thus signals enormous storage as well as processing
capabilities.


Popularity of MongoDB:
* MongoDB was awarded database of the year for 2014 and 2015, making it the
first database to achieve this feat in consecutive years.
* MongoDB is the top-ranked NoSQL database and ranks 4th among all available
databases.
* MongoDB runs in production deployments at more than 70 percent of the
Fortune 500 companies.
* MongoDB has recently been deployed at small-cap and mid-cap companies as
well.


Features of MongoDB:
* MongoDB is a distributed, document-oriented and open source database.
* Mapped to a SQL database, collections in MongoDB correspond to tables and
documents in MongoDB correspond to records.
* MongoDB treats records as objects and uses JavaScript functions and syntax
to process results out of the collections.
* In MongoDB all records are organized into JSON documents, where every
document is treated as an object.
* MongoDB can be used for any domain and any kind of application, which gives
it one more advantage over its counterparts like Cassandra and Dynamo.
* It is a cross-platform database that can be installed on any operating
system, be it Windows, Linux or any other OS. Only the installation of MongoDB
differs across operating systems; once you get into the Mongo shell,
everything is the same irrespective of the underlying OS.
* MongoDB integrates with almost all widely used programming languages, from
basic ones like C and C++ to languages like Java, PHP, Perl and Ruby.
* MongoDB provides automatic scaling through a concept called sharding.
* It also provides high performance, high availability and automatic failover
through a feature called replication.
* MongoDB provides full support for indexing and implements aggregation in
three different ways: pipeline, standalone (single-purpose) operations and
Map Reduce.
* It has open integrations with popular BI tools like Pentaho and Tableau.
* It integrates with Hadoop, providing a powerful platform for data analysis.
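
As an illustration of two of the aggregation styles listed above, the sketch below expresses the same computation (total order amount per city, for orders of at least 60) once as an aggregation pipeline specification and once as a map/reduce pair. The sample data and the helper names map_order, reduce_amounts and map_reduce are assumptions made for this example. To stay runnable without a server, only the map/reduce form is evaluated, in memory; the pipeline is shown as the list of stage documents a driver would send.

```python
from collections import defaultdict

# Hypothetical sample documents.
orders = [
    {"city": "Pune", "amount": 120},
    {"city": "Mumbai", "amount": 80},
    {"city": "Pune", "amount": 50},
]

# Pipeline form: filter first, then group by city and sum the amounts.
pipeline = [
    {"$match": {"amount": {"$gte": 60}}},
    {"$group": {"_id": "$city", "total": {"$sum": "$amount"}}},
]


def map_order(doc):
    """Map step: emit (key, value) pairs for documents that pass the filter."""
    if doc["amount"] >= 60:
        yield doc["city"], doc["amount"]


def reduce_amounts(key, values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)


def map_reduce(docs, mapper, reducer):
    """In-memory stand-in for a map/reduce run over a collection."""
    emitted = defaultdict(list)
    for doc in docs:
        for key, value in mapper(doc):
            emitted[key].append(value)
    return {key: reducer(key, values) for key, values in emitted.items()}


totals = map_reduce(orders, map_order, reduce_amounts)
print(totals)
```

The pipeline form is declarative, which is what makes it a natural target for a graphical builder; the map/reduce form requires user-written functions.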


4. RELATED WORK 

According to Girts Karnitis and Guntis Arnicans [1], data migration is a
combination of two steps: first we restructure the data from the source to meet
the specifications of the target system, and second we initiate the transfer
from the source to the destination [1]. Several methods, including but not
restricted to schema conversion, meta-modelling [4], ETL and automated data
migration approaches, deal with these steps [1]. Data in an RDBMS is available
at two levels, viz. the physical level and the logical level [1]. Different
tree-building algorithms are used to define the logical hierarchy that exists
in the data [1].
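
The two-step view of migration can be sketched as follows. This is an illustrative simplification and not the algorithm of [1]: it assumes a single parent/child table pair linked by one foreign key, and the function names restructure and transfer, as well as the sink callable, are hypothetical.

```python
from collections import defaultdict


def restructure(parents, children, parent_key, foreign_key, embed_as):
    """Step 1: reshape relational rows into nested documents.

    Child rows are grouped by their foreign key and embedded in the
    matching parent row under the field name given by embed_as.
    """
    grouped = defaultdict(list)
    for child in children:
        # The foreign key is redundant once the row is embedded.
        row = {k: v for k, v in child.items() if k != foreign_key}
        grouped[child[foreign_key]].append(row)

    documents = []
    for parent in parents:
        doc = dict(parent)
        doc[embed_as] = grouped.get(parent[parent_key], [])
        documents.append(doc)
    return documents


def transfer(documents, sink):
    """Step 2: hand each restructured document to the target system."""
    count = 0
    for doc in documents:
        sink(doc)
        count += 1
    return count


# Hypothetical source tables.
departments = [{"dept_id": 1, "name": "R&D"}, {"dept_id": 2, "name": "Sales"}]
employees = [{"emp_id": 7, "dept_id": 1, "name": "Asha"}]

target = []  # stands in for the destination collection
docs = restructure(departments, employees, "dept_id", "dept_id", "employees")
transfer(docs, target.append)
print(target)
```

Here the sink is a plain list's append method; with a real database driver it would be replaced by whatever insert call the driver provides. Deeper hierarchies would require applying the restructuring step recursively over the foreign-key tree.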


Gansen Zhao, Weichai Huang, Shunlin Liang and Yong Tang [2] map MongoDB onto a
relational model. The authors define the schema at two levels. One is the
micro level, in which the schema of a MongoDB collection can be considered
fixed; the other is the macro level, in which the schema of a MongoDB
collection may vary between any two given instants of time. At the micro
level, the relational model of MongoDB can be stated as follows: a MongoDB
collection maps to a table in the relational model. The schema is composed of
the keys present in the documents, which define the structure of all tuples.
Every value corresponds to the value of a key, and is null if the key is not
present in a document. Parent-child relations can also exist between tables;
they are identified by primary and foreign keys, which define the keys in the
main document and the keys for the sub-documents [4].
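
The micro-level mapping described above can be illustrated with a short sketch: the table schema is the union of keys seen across the documents, and a missing key yields a null value (None in Python). The function name collection_to_table is hypothetical, and the sketch covers only flat documents, not the parent-child case.

```python
def collection_to_table(documents):
    """View a collection of flat documents as (columns, rows).

    Columns are the union of all keys, in order of first appearance.
    A key absent from a document becomes None in that document's row.
    """
    columns = []
    for doc in documents:
        for key in doc:
            if key not in columns:
                columns.append(key)
    rows = [tuple(doc.get(col) for col in columns) for doc in documents]
    return columns, rows


# Hypothetical collection whose documents do not share all keys.
people = [
    {"_id": 1, "name": "Asha"},
    {"_id": 2, "age": 30},
]
columns, rows = collection_to_table(people)
print(columns)
print(rows)
```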


Thalheim and Wang [3] state that in order to migrate data, one needs to
have a thorough understanding of the data source, such as data availability and
data constraints, since different data sources are designed using different
modelling semantics.


Aryan Bansel, Horacio González-Vélez and Adriana E. Chis proposed a novel
approach in the paper "Cloud-Based NoSQL Data Migration" for the migration of
data across cloud-based heterogeneous NoSQL repositories, in particular
document, columnar and graph based databases. This research addresses the
horizontal heterogeneity among NoSQL databases, which arises because there
exists a broad set of NoSQL implementations with different properties and
characteristics. Importantly, the work discusses the challenges of migrating
data from one NoSQL format to another, such as assurance of data adaptability,
consistency and integrity [4].


Mongify is a tool that enables data translation from SQL databases to MongoDB
[9]. It provides integration with MySQL, PostgreSQL, SQLite, Oracle, SQL
Server and DB2 [9]. It works well with all versions of MongoDB [9].


MongoBooster is a cross-platform, shell-centric GUI tool for MongoDB that
provides in-place updates and integration with Moment.js [7]. It supports ES6
syntax and provides a true IntelliSense experience [7]. It supports a
mongoose-like fluent query builder API, which enables building queries using
chained method calls instead of plain JSON objects [7]. Writing more concise
and readable MongoDB scripts becomes easy with the built-in support for
block-scoped variables, arrow functions and template strings [7].


RoboMongo does not emulate the MongoDB shell; rather, it embeds the same
environment and engine that is available with the Mongo shell. It currently
supports MongoDB version 3.2. RoboMongo executes code in an internal
JavaScript VM rather than simply analyzing the semantics of the code, giving
the user an auto-complete feature at runtime that cannot be obtained using
static methods [6].


NoSQL Viewer is a free GUI-based client for NoSQL databases like MongoDB,
Cassandra, Couchbase, CouchDB and HBase [1][8]. It enables users to perform
CRUD operations from one single platform, eliminating the overhead of using
multiple tools for different databases. It provides easy yet powerful,
high-performance migration functionality between any supported Big Data
databases [8].


The MongoDB Connector for Hadoop enables integration between two powerful
data storage systems from the OLTP and OLAP sectors [7]. It allows MongoDB and
Hadoop to be used as sources and destinations for data transfers [7].


Database Master is an easy-to-use database querying, administration and
management tool that provides a consistent user interface with modern styling
[8]. It simplifies the process of managing, monitoring, querying, editing,
visualizing and designing relational and NoSQL database systems [8]. It allows
users to execute extended scripts in SQL, JSON and C#, and exposes all
database objects such as tables, views, procedures, packages, columns, indexes
and triggers [8].


5. PROPOSED WORK

The proposed work is intended to be a blend of three web-enabled GUI
applications on one single platform:

* A migration tool that can exchange data between
> MongoDB and RDBMS
> MongoDB and Hadoop

* A GUI tool that will provide
> Drag and drop elements pertaining to CRUD operations, aggregation,
indexing, replication and sharding
> Auto-fill of syntax wherever applicable

* An administrative tool that will provide
> One-click MongoDB cluster setup
> Cluster monitoring
> Replica set setup
> Sharded cluster management
> Query transformation
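
One way the drag and drop elements could be turned into a query is sketched below. This is an assumption about a possible design, not the tool's implementation: each graphical element is modelled as a small Python object (Filter, Sort, Limit, all hypothetical names), and compile_steps converts an ordered list of such elements into MongoDB aggregation pipeline stages ($match, $sort, $limit).

```python
from dataclasses import dataclass

# Labels a GUI might show, mapped to MongoDB comparison operators.
OPERATORS = {"equals": "$eq", "greater than": "$gt", "less than": "$lt"}


@dataclass
class Filter:
    field: str
    operator: str  # one of the keys in OPERATORS
    value: object


@dataclass
class Sort:
    field: str
    descending: bool = False


@dataclass
class Limit:
    count: int


def compile_steps(steps):
    """Translate an ordered list of GUI steps into pipeline stages."""
    pipeline = []
    for step in steps:
        if isinstance(step, Filter):
            condition = {OPERATORS[step.operator]: step.value}
            pipeline.append({"$match": {step.field: condition}})
        elif isinstance(step, Sort):
            pipeline.append({"$sort": {step.field: -1 if step.descending else 1}})
        elif isinstance(step, Limit):
            pipeline.append({"$limit": step.count})
        else:
            raise TypeError(f"unknown step: {step!r}")
    return pipeline


# The user drops three elements onto the canvas, in this order.
steps = [Filter("age", "greater than", 21), Sort("age", descending=True), Limit(5)]
print(compile_steps(steps))
```

Keeping the graphical steps as plain data makes them easy to validate before any query is sent, which is also where syntax auto-fill could be applied.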


6. CONCLUSION

In this way we are looking forward to building a tool that will ease the
pressure on developers to understand the syntax and will help build better
queries. Moreover, it will enable quick migration activities, as the tedious
load of query translation will be taken care of. We would be implementing the
GUI tool for CRUD operations, which can be extended further to support
aggregations as well, and the migration tool that we would be building will
support 1NF, which can be further extended to support higher hierarchies.


REFERENCES
1. Girts Karnitis and Guntis Arnicans, "Migration of Relational Database to
Document-Oriented Database: Structure Denormalization and Data
Transformation". 7th International Conference on Computational Intelligence,
Communication Systems and Networks (CICSyN).

2. Gansen Zhao, Weichai Huang, Shunlin Liang, Yong Tang, "Modelling MongoDB
with Relational Model". 2013 Fourth International Conference on Emerging
Intelligent Data and Web Technologies.

3. B. Thalheim and Q. Wang, "Data Migration: A theoretical perspective". Data
and Knowledge Engineering, vol. 87, pp. 260-278, 2013.

4. Aryan Bansel, Horacio González-Vélez, Adriana E. Chis, "Cloud-based NoSQL
Data Migration". 2016 24th Euromicro International Conference on Parallel,
Distributed, and Network-Based Processing.

5. F. Matthes and C. Schulz, "Towards an integrated data migration process
model". Software Engineering for Business Information Systems (sebis), 2011.

6. www.robomongo.org/

7. www.monogbooster.com/

8. www.mongodb.com/cloud

9. www.mongify.com

International Education & Research Journal [IERJ] 


E-ISSN No: 2454-9916 | Volume: 3 | Issue: 2 | Feb 2017 





